Abstract. In this paper we provide new bounds on classical and quantum distributional
communication complexity in the two-party, one-way model of communication. 

In the classical one-way model, our bound extends the well-known upper bound of Kremer,
Nisan and Ron [KNR95] to include non-product distributions. Let $\epsilon \in (0, 1/2)$ be a constant.
We show that for a boolean function $f : \mathcal{X} \times \mathcal{Y} \to \{0, 1\}$ and a non-product distribution $\mu$ on
$\mathcal{X} \times \mathcal{Y}$,

$$D^{1,\mu}_{\epsilon}(f) = O\big((I(X:Y) + 1) \cdot \mathrm{VC}(f)\big),$$

where $D^{1,\mu}_{\epsilon}(f)$ represents the one-way distributional communication complexity of $f$ with error
at most $\epsilon$ under $\mu$; $\mathrm{VC}(f)$ represents the Vapnik-Chervonenkis dimension of $f$ and $I(X:Y)$
represents the mutual information, under $\mu$, between the random inputs of the two parties.
For a non-boolean function $f : \mathcal{X} \times \mathcal{Y} \to \{1, \ldots, k\}$ ($k \ge 2$ an integer), we show a similar upper
bound on $D^{1,\mu}_{\epsilon}(f)$ in terms of $k$, $I(X:Y)$ and the pseudo-dimension of $f' = f/k$, a generalization
of the VC-dimension for non-boolean functions.

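In the quantum one-way model we provide a lower bound on the distributional communication
complexity, under product distributions, of a function $f$, in terms of the well-studied complexity
measure of $f$ referred to as the rectangle bound or the corruption bound of $f$. We show for a
non-boolean total function $f : \mathcal{X} \times \mathcal{Y} \to \mathcal{Z}$ and a product distribution $\mu$ on $\mathcal{X} \times \mathcal{Y}$,

$$Q^{1,\mu}_{\epsilon^3/8}(f) = \Omega\big(\mathrm{rec}^{1,\mu}_{\epsilon}(f)\big),$$

where $Q^{1,\mu}_{\epsilon^3/8}(f)$ represents the quantum one-way distributional communication complexity of
$f$ with error at most $\epsilon^3/8$ under $\mu$, and $\mathrm{rec}^{1,\mu}_{\epsilon}(f)$ represents the one-way rectangle bound of $f$
with error at most $\epsilon$ under $\mu$. Similarly, for a non-boolean partial function $f : \mathcal{X} \times \mathcal{Y} \to \mathcal{Z} \cup \{*\}$
and a product distribution $\mu$ on $\mathcal{X} \times \mathcal{Y}$, we show

$$Q^{1,\mu}_{\epsilon^6/(2 \cdot 15^4)}(f) = \Omega\big(\mathrm{rec}^{1,\mu}_{\epsilon}(f)\big).$$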
1 Introduction



Communication complexity studies the minimum amount of communication that two or more parties 
need to compute a given function or a relation of their inputs. Since its inception in the seminal paper 
by Yao [Yao79] , communication complexity has been an important and widely studied research area. 
This is the case both because of the interesting and intriguing mathematics involved in its study, and 
also because of the fundamental connections it bears with many other areas in theoretical computer 
science, such as data structures, streaming algorithms, circuit lower bounds, decision tree complexity, 
VLSI designs, etc. 

Different models of communication have been proposed and studied. In the basic and standard
two-party interactive model, two parties, say Alice and Bob, receive inputs $x \in \mathcal{X}$ and
$y \in \mathcal{Y}$, respectively. They interact with each other, possibly exchanging several messages, in order
to jointly compute, say, a given function $f(x, y)$ of their inputs. If only one message is allowed, say
from Alice to Bob, and Bob outputs $f(x, y)$ without any further interaction with Alice, then the model
is said to be one-way. Though seemingly simple, this model has numerous nontrivial questions as well
as applications to other areas such as lower bounds for streaming algorithms; see for example [Mut05].
Other models like the Simultaneous message passing (SMP) model, and multi-party models are also 
studied. We refer readers to the textbook [KN97] for a comprehensive introduction to the field 
of classical communication complexity. In 1993, Yao [Yao93] introduced quantum communication 
complexity and since then it has also become a very active and vibrant area of research. In the 
quantum communication models, the parties are allowed to use quantum computers to process their 
inputs and to use quantum channels to send messages. 

In this paper we are primarily concerned with the one-way model, and we assume that the
single message is always from Alice to Bob. Let us first briefly discuss a few classical models.
In the deterministic one-way model, the parties act in a deterministic fashion and compute $f$
correctly on all input pairs $(x, y)$. The minimum communication required for accomplishing this
is called the deterministic complexity of $f$ and is denoted by $D^{1}(f)$. Allowing the parties to
use randomness and to err on their inputs with a small non-zero probability often
results in considerable savings in communication. The communication of the best public-coin one-way
protocol that has error at most $\epsilon$ on all inputs is referred to as the one-way public-coin randomized
communication complexity of $f$ and is denoted by $R^{1,\mathrm{pub}}_{\epsilon}(f)$. Similarly we can define the one-way
private-coin randomized communication complexity of $f$, denoted by $R^{1}_{\epsilon}(f)$, and in the quantum
model, the one-way quantum communication complexity of $f$, denoted by $Q^{1}_{\epsilon}(f)$. Please refer to
Sec. 2.2 for explicit definitions. When the subscript is omitted, $\epsilon$ is assumed to be 1/3.

Sometimes the requirement on communication protocols is less stringent and it is only required
that the average error, under a given distribution $\mu$ on the inputs, is small. The communication of the
best one-way classical protocol that has average error at most $\epsilon$ under $\mu$ is referred to as the one-way
distributional communication complexity of $f$ and is denoted by $D^{1,\mu}_{\epsilon}(f)$. We can define the one-way
distributional quantum communication complexity $Q^{1,\mu}_{\epsilon}(f)$ in a similar way. A useful connection
between the public-coin randomized and distributional communication complexities, via Yao's
Principle [Yao77], states that for a given $\epsilon \in (0, 1/2)$, $R^{1,\mathrm{pub}}_{\epsilon}(f) = \max_{\mu} D^{1,\mu}_{\epsilon}(f)$. A distribution $\mu$
that achieves the maximum in Yao's Principle, that is, for which $R^{1,\mathrm{pub}}_{\epsilon}(f) = D^{1,\mu}_{\epsilon}(f)$, is referred to
as a hard distribution for $f$. This principle also holds in many other models and allows for a good
handle on the public-coin randomized complexity in scenarios where the distributional complexity
is much easier to understand. Often, the distributional complexity when the inputs of Alice and Bob
are drawn independently from a product distribution is easier to understand. Nonetheless, as
is the case with several important functions like Set Disjointness (DISJ) and Inner Product (IP), the
maximum in Yao's Principle, in the one-way model, often occurs for a product distribution, and hence it
paves the way for understanding the public-coin randomized complexity.

Let us now discuss our first main result which is in the classical one-way model. We ask the 
reader to refer to Sec. 2 for the definitions of various quantities involved in the discussion below. 



1.1 Classical upper bound 



For a boolean function $f : \mathcal{X} \times \mathcal{Y} \to \{0, 1\}$, its Vapnik-Chervonenkis (VC) dimension, denoted by
$\mathrm{VC}(f)$, is an important complexity measure, widely studied especially in the context of computational
learning theory. Kremer, Nisan and Ron [KNR95, Thm. 3.2] found a beautiful connection between
the distributional complexity of $f$ under product distributions on $\mathcal{X} \times \mathcal{Y}$, and $\mathrm{VC}(f)$, as follows.

Theorem 1 ([KNR95]). Let $f : \mathcal{X} \times \mathcal{Y} \to \{0, 1\}$ be a boolean function and let $\epsilon \in (0, 1/2)$ be a
constant. Let $\mu$ be a product distribution on $\mathcal{X} \times \mathcal{Y}$. There is a universal constant $\kappa$ such that,

$$D^{1,\mu}_{\epsilon}(f) \le \kappa \cdot \frac{1}{\epsilon} \log \frac{1}{\epsilon} \cdot \mathrm{VC}(f).$$ (1)

Note that such a relation cannot hold for non-product distributions $\mu$, since otherwise it would
translate, via Yao's Principle, into $R^{1,\mathrm{pub}}(f) = O(\mathrm{VC}(f))$ for all boolean $f$. This is not true, as
is exhibited by several functions, for example the Greater Than ($\mathrm{GT}_n$) function, in which Alice and
Bob need to determine which of their $n$-bit inputs is bigger. For this function, $R^{1,\mathrm{pub}}(\mathrm{GT}_n) = \Omega(n)$
but $\mathrm{VC}(\mathrm{GT}_n) = 1$. Nonetheless, for these functions, any hard distribution $\mu$ is highly correlated
between $\mathcal{X}$ and $\mathcal{Y}$. Therefore it is conceivable that such a relationship, as in Eq. (1), could still hold,
possibly after taking into account the amount of correlation in a given non-product distribution.
This question, although probably never explicitly asked in any previous work, appears to be quite
fundamental. We answer it in the positive by the following.

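Theorem 2. Let $f : \mathcal{X} \times \mathcal{Y} \to \{0, 1\}$ be a boolean function and let $\epsilon \in (0, 1/2)$ be a constant. Let
$\mu$ be a distribution (possibly non-product) on $\mathcal{X} \times \mathcal{Y}$. Let $XY$ be joint random variables distributed
according to $\mu$. There is a universal constant $\kappa$ such that,

$$D^{1,\mu}_{\epsilon}(f) \le \kappa \cdot \frac{1}{\epsilon} \log \frac{1}{\epsilon} \cdot \left( \frac{1}{\epsilon} \cdot I(X:Y) + 1 \right) \cdot \mathrm{VC}(f).$$

In particular, for constant $\epsilon$,

$$D^{1,\mu}_{\epsilon}(f) = O\big((I(X:Y) + 1) \cdot \mathrm{VC}(f)\big).$$

Above, $I(X:Y)$ represents the mutual information between the correlated random variables $X$ and $Y$,
distributed according to $\mu$.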

Let us discuss below a few aspects of this result and its relationship with what is previously known. 

Note that in combination with Yao's Principle, Thm. 2 gives us the following (where the mutual
information is now considered under a hard distribution for $f$).

$$R^{1,\mathrm{pub}}(f) = O\big((I(X:Y) + 1) \cdot \mathrm{VC}(f)\big).$$ (2)

1. It is easily observed using Sauer's Lemma (Lem. 2, Sec. 2) that the deterministic complexity of
$f$ satisfies

$$D^{1}(f) = O(\mathrm{VC}(f) \cdot \log |\mathcal{Y}|).$$ (3)

This is because Alice can simply send the name of $f_x$ using $O(\mathrm{VC}(f) \cdot \log |\mathcal{Y}|)$ bits, since $|\mathcal{F}| \le |\mathcal{Y}|^{\mathrm{VC}(f)}$.
Now our result (2) is on one hand stronger than (3) in the sense that $I(X:Y) \le \log |\mathcal{Y}|$ always, and
$I(X:Y)$ could be much smaller than $\log |\mathcal{Y}|$ depending on $\mu$. An example of such a case is the
Inner Product ($\mathrm{IP}_n$) function, in which Alice and Bob need to determine the inner product (mod
2) of their $n$-bit input strings. For $\mathrm{IP}_n$, a hard distribution is the uniform distribution, which
is product, and hence $I(X:Y) = 0$, whereas $\log |\mathcal{Y}| = n$. On the other hand, (2) is
also weaker than (3) in the sense that it only upper bounds the public-coin randomized complexity,
whereas (3) upper bounds the deterministic complexity of $f$.

2. Aaronson [Aar07] shows that for a total or partial boolean function $f$,

$$R^{1}(f) = O(Q^{1}(f) \cdot \log |\mathcal{Y}|).$$ (4)

Again (2) is stronger than (4) in the sense that $I(X:Y)$ could be much smaller than $\log |\mathcal{Y}|$
depending on $\mu$. Also it is known that $Q^{1}(f) = \Omega(\mathrm{VC}(f))$ always, following from Nayak [Nay99],
and $Q^{1}(f)$ could be much larger than $\mathrm{VC}(f)$. An example is the Greater Than ($\mathrm{GT}_n$) function,
for which $Q^{1}(\mathrm{GT}_n) = \Omega(n)$, whereas $\mathrm{VC}(\mathrm{GT}_n) = O(1)$. On the other hand, (2) only holds for
total boolean functions, whereas (4) also holds for partial boolean functions.

3. As mentioned before, for all total boolean functions $f$, $R^{1,\mathrm{pub}}(f) = \Omega(\mathrm{VC}(f))$, and $R^{1,\mathrm{pub}}(f)$
could be much larger than $\mathrm{VC}(f)$ (as for the function $\mathrm{GT}_n$). Now Eq. (2) says that in the latter case,
the mutual information $I(X:Y)$ under any hard distribution $\mu$ must be large. That is, a hard
distribution $\mu$ must be highly correlated.

4. It is known that for total boolean functions $f$ for which a hard distribution is product, there is
no separation between the one-way public-coin randomized and quantum communication
complexities. Now our theorem gives a smooth extension of this fact to the functions whose hard
distributions are not product ones.

A generalization of the VC-dimension for non-boolean functions is referred to as the
pseudo-dimension (Def. 2, Sec. 2). For a non-boolean function $f : \mathcal{X} \times \mathcal{Y} \to \{1, \ldots, k\}$ ($k \ge 2$ an integer), we
show a similar upper bound on $D^{1,\mu}_{\epsilon}(f)$ in terms of $k$, $I(X:Y)$ and the pseudo-dimension of $f' = f/k$.

Theorem 3. Let $k \ge 2$ be an integer. Let $f : \mathcal{X} \times \mathcal{Y} \to \{1, \ldots, k\}$ and $\epsilon \in (0, 1/6)$ be a constant. Let
$f' : \mathcal{X} \times \mathcal{Y} \to [0, 1]$ be such that $f'(x, y) = f(x, y)/k$. Let $\mu$ be a distribution (possibly non-product)
on $\mathcal{X} \times \mathcal{Y}$, and $XY$ be joint random variables distributed according to $\mu$. Then there is a universal
constant $\kappa$ such that,

1,4/1 

Dlfif) < K • ^ • (log - + dlog^ — j • {I{X : Y) + log k) 
where d'^=V ^2 (f) is the -pseudo-dimension of f. 

Let us now discuss our other main result which we show in the quantum one-way model. 
1.2 Quantum lower bound 

For a function $f : \mathcal{X} \times \mathcal{Y} \to \mathcal{Z}$, a measure of its complexity that is often very useful in understanding
its classical randomized communication complexity is what is referred to as the rectangle bound
(denoted by $\mathrm{rec}(f)$), also often known as the corruption bound. The rectangle bound $\mathrm{rec}(f)$ is actually
defined first via a distributional version $\mathrm{rec}^{\mu}(f)$. It is a well-studied measure, and $\mathrm{rec}^{\mu}_{\epsilon}(f)$ is well
known to form a lower bound on $D^{\mu}_{\epsilon}(f)$ both in the one-way and two-way models. In fact, in a
celebrated result, Razborov [Raz92] provided an optimal lower bound on the randomized communication
complexity of the Set Disjointness function by arguing a lower bound on its rectangle bound.

It is natural to ask if this measure also forms a lower bound on the quantum communication 
complexity. We answer in the positive for this question in the one-way model. We show that, for 
a total or partial function, the quantum distributional one-way communication complexity under a 
given product distribution $\mu$ is lower bounded by the corresponding one-way rectangle bound. Our
precise result is as follows. 

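Theorem 4. Let $f : \mathcal{X} \times \mathcal{Y} \to \mathcal{Z}$ be a total function and let $\epsilon \in (0, 1/2)$ be a constant. Let $\mu$ be a
product distribution on $\mathcal{X} \times \mathcal{Y}$ and let $\mathrm{rec}^{1,\mu}_{\epsilon}(f) > 2 \cdot \log(1/\epsilon)$. Then,

$$Q^{1,\mu}_{\epsilon^3/8}(f) \ge [\ldots] \cdot (1 - 2\epsilon) \cdot \big(S(\epsilon/2) - S(\epsilon/4)\big) \cdot \big(\lfloor \mathrm{rec}^{1,\mu}_{\epsilon}(f) \rfloor - 1\big) = \Omega\big(\mathrm{rec}^{1,\mu}_{\epsilon}(f)\big),$$ (5)

where for $p \in (0, 1)$, $S(p)$ is the binary entropy function $S(p) = -p \log p - (1 - p) \log(1 - p)$.
If $f : \mathcal{X} \times \mathcal{Y} \to \mathcal{Z} \cup \{*\}$ is a partial function then,

$$Q^{1,\mu}_{\epsilon^6/(2 \cdot 15^4)}(f) \ge [\ldots] \cdot \big(\lfloor \mathrm{rec}^{1,\mu}_{\epsilon}(f) \rfloor - 1\big) = \Omega\big(\mathrm{rec}^{1,\mu}_{\epsilon}(f)\big).$$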



Let us make a few important remarks here related to this result. 

1. Recently, Jain, Klauck and Nayak [JKN08] showed that for any relation $f \subseteq \mathcal{X} \times \mathcal{Y} \times \mathcal{Z}$, the
rectangle bound of $f$ tightly characterizes the randomized one-way classical communication
complexity of $f$.

Theorem 5 ([JKN08]). Let $f \subseteq \mathcal{X} \times \mathcal{Y} \times \mathcal{Z}$ be a relation and let $\epsilon \in (0, 1/2)$. Then,

$$R^{1,\mathrm{pub}}_{\epsilon}(f) = \Theta\big(\mathrm{rec}^{1}_{\epsilon}(f)\big).$$

While showing Thm. 5, Jain, Klauck and Nayak [JKN08] have shown that for all relations
$f \subseteq \mathcal{X} \times \mathcal{Y} \times \mathcal{Z}$ and for all distributions $\mu$ (product and non-product) on $\mathcal{X} \times \mathcal{Y}$, $D^{1,\mu}_{\epsilon}(f) =
\Omega(\mathrm{rec}^{1,\mu}_{\epsilon}(f))$. However, in the quantum setting we are making a similar statement only for (total
or partial) functions $f$ and only for product distributions $\mu$ on $\mathcal{X} \times \mathcal{Y}$. In fact it does not hold
if we let $\mu$ be non-product. It can be shown that there is a total function $f$ and a non-product
distribution $\mu$ such that $Q^{1,\mu}_{\epsilon}(f)$ is exponentially smaller than $\mathrm{rec}^{1,\mu}_{\epsilon}(f)$. This fact is implicit
in the work of Gavinsky et al. [GKK+07]. We make an explicit statement of this in Appendix A
and skip its proof for brevity.

2. Let $\epsilon \in (0, 1/4)$. Jain, Klauck and Nayak [JKN08] have shown that for all relations $g \subseteq \mathcal{X} \times \mathcal{Y} \times \mathcal{Z}$,

$$R^{1,[]}_{\epsilon}(g) = O\big(\mathrm{rec}^{1,[]}_{\epsilon}(g)\big).$$

Here the superscript $[]$ represents maximization over all product distributions. From Thm. 4, for
a (total or partial) function $f$ we get,

$$Q^{1,[]}_{\epsilon^6/(2 \cdot 15^4)}(f) = \Omega\big(\mathrm{rec}^{1,[]}_{\epsilon}(f)\big).$$

Since $R^{1,[]}_{\epsilon}(f) \ge Q^{1,[]}_{\epsilon}(f)$, combining everything we get,

Theorem 6. Let $\epsilon \in (0, 1/4)$. Let $f : \mathcal{X} \times \mathcal{Y} \to \mathcal{Z} \cup \{*\}$ be a (possibly partial and non-boolean)
function. Then

$$R^{1,[]}_{\epsilon^6/(2 \cdot 15^4)}(f) \ge Q^{1,[]}_{\epsilon^6/(2 \cdot 15^4)}(f) = \Omega\big(R^{1,[]}_{\epsilon}(f)\big).$$

It was known earlier that for total boolean functions, $Q^{1,[]}(f)$ is tightly bounded by $R^{1,[]}(f)$. We
extend such a relationship here to apply to non-boolean (partial) functions as well. We remark
that the earlier proofs for total boolean functions used the VC-dimension result, Thm. 1, of
Kremer, Nisan and Ron [KNR95]. We get the same result here without requiring it.

We finally present an application of our result Thm. 4 in the context of studying the security of
extractors against quantum adversaries. An extractor is a function that is used to extract almost uniform
randomness from a source of imperfect randomness. As very well-studied objects, extractors have
found several uses in many cryptographic applications and also in complexity theory. Recently, the
security of various extractors has been increasingly studied in the presence of quantum adversaries, since
such secure extractors are then useful in several applications such as privacy amplification in
quantum key distribution and key-expansion in quantum bounded storage models [KMR05,KR05,KT08].
In particular, König and Terhal [KT08] have shown that any boolean extractor that can extract
a uniform bit from sources of min-entropy $k$ is also secure against quantum adversaries with their
memory bounded by a function of $k$.

We get a similar statement for boolean extractors as a corollary of our result Thm. 4. We obtain
this corollary by observing a key connection between the minimum min-entropy that an extractor
function $f$ needs to extract a uniform bit and its rectangle bound. The precise statement of our result,
its relationship with the result of [KT08], and other detailed discussions are deferred to Sec. 5.

1.3 Organization

In the following Sec. 2 we discuss various information theoretic preliminaries and the model of
one-way communication. In Sec. 3 we present the upper bounds in the classical setting. In the following
Sec. 4 we present the lower bounds in the quantum setting. The application concerning extractors 
is discussed in Sec. 5. We finally conclude with some open questions in Sec. 6. 



2 Preliminaries 



2.1 Information theory 

In this section we present some information theoretic notations, definitions and facts that we use 
in the rest of the paper. For an introduction to classical and quantum information theory, we refer 
the reader to the texts by Cover and Thomas [CT91] and Nielsen and Chuang [NC00] respectively.
Most of the facts stated in this section without proofs may be found in these books. 

All logarithms in this paper are taken with base 2, unless otherwise specified. For an integer
$t \ge 1$, $[t]$ represents the set $\{1, \ldots, t\}$. For square matrices $P, Q$, by $Q \ge P$ we mean that $Q - P$ is
positive semi-definite. For a matrix $A$, $\|A\|_1 := \mathrm{Tr}\sqrt{A^{\dagger} A}$ denotes its $\ell_1$ norm. For $p \in (0, 1)$, let
$S(p) := -p \log p - (1 - p) \log(1 - p)$ denote the binary entropy function. We have the following fact.

Fact 1 For $\delta \in [0, 1/2]$, $S(\frac{1}{2} + \delta) \le 1 - 2\delta^2$ and $S(\delta) \le 2\sqrt{\delta}$.
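As a quick sanity check of Fact 1, the following minimal Python sketch evaluates both inequalities on a grid of values of $\delta$. The function names, the grid size and the floating-point tolerance are illustrative assumptions and are not part of the paper; a grid check is of course not a proof.

```python
import math


def binary_entropy(p: float) -> float:
    """Binary entropy S(p) in bits, with the convention S(0) = S(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def fact1_holds_on_grid(steps: int = 1000, tol: float = 1e-12) -> bool:
    """Check both inequalities of Fact 1 at delta = 0, 1/(2*steps), ..., 1/2.

    The grid size and the tolerance are arbitrary choices for illustration.
    """
    for i in range(steps + 1):
        delta = 0.5 * i / steps
        # First inequality: S(1/2 + delta) <= 1 - 2 * delta^2.
        if binary_entropy(0.5 + delta) > 1.0 - 2.0 * delta ** 2 + tol:
            return False
        # Second inequality: S(delta) <= 2 * sqrt(delta).
        if binary_entropy(delta) > 2.0 * math.sqrt(delta) + tol:
            return False
    return True
```

Calling `fact1_holds_on_grid()` returns `True` when no grid point violates either bound.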

A quantum state, usually represented by letters $\rho, \sigma$ etc., is a positive semi-definite trace one
operator in a given Hilbert space. Specializing from the quantum case, we view a discrete probability
distribution $P$ as a positive semi-definite trace one diagonal matrix indexed by its (finite) sample
space. For a distribution $P$ with support on a set $\mathcal{X}$, and $x \in \mathcal{X}$, $P(x)$ denotes the $(x, x)$ diagonal
entry of $P$, and $P(\mathcal{E}) := \sum_{x \in \mathcal{E}} P(x)$ denotes the probability of the event $\mathcal{E} \subseteq \mathcal{X}$. A distribution $P$
on $\mathcal{X} \times \mathcal{Y}$ is said to be product across $\mathcal{X}$ and $\mathcal{Y}$ if it can be written as $P = P_{\mathcal{X}} \otimes P_{\mathcal{Y}}$, where $P_{\mathcal{X}}, P_{\mathcal{Y}}$
are distributions on $\mathcal{X}, \mathcal{Y}$ respectively and $\otimes$ is the tensor operation. Often for product distributions
we do not mention the sets across which they are product if this is clear from the context.

Let $X$ be a classical random variable (or simply random variable) taking values in $\mathcal{X}$. For a
random variable $X$, we also let $X$ represent its probability distribution. The entropy of $X$, denoted
$S(X)$, is defined to be $S(X) := -\mathrm{Tr}\, X \log X$. Since $X$ is classical, an equivalent definition would be
$S(X) := -\sum_{x \in \mathcal{X}} \Pr[X = x] \log \Pr[X = x]$. Let $X, Y$ be correlated random variables taking values
in $\mathcal{X}, \mathcal{Y}$ respectively. $X$ and $Y$ are said to be independent if their joint distribution is product. The mutual
information between them, denoted $I(X:Y)$, is defined to be $I(X:Y) := S(X) + S(Y) - S(XY)$,
and the conditional entropy, denoted $S(X|Y)$, is defined to be $S(X|Y) := S(XY) - S(Y)$. It is easily seen
that $S(X|Y) = \mathbb{E}_{y \leftarrow Y}[S(X|(Y = y))]$.
We have the following facts.

Fact 2 For all random variables $X, Y$: $I(X:Y) \ge 0$; in other words $S(X) + S(Y) \ge S(XY)$. If
$X, Y$ are independent then we have $I(X:Y) = 0$; in other words $S(XY) = S(X) + S(Y)$.
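The definitions above are easy to evaluate on small examples. The sketch below is an illustration only: it represents a joint distribution as a Python dictionary (an assumed encoding, not the paper's) and computes $I(X:Y) = S(X) + S(Y) - S(XY)$. The two example distributions are hypothetical.

```python
import math


def entropy(probs) -> float:
    """Shannon entropy in bits of an iterable of probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def mutual_information(joint) -> float:
    """I(X:Y) = S(X) + S(Y) - S(XY) for joint[(x, y)] = Pr[X = x, Y = y]."""
    p_x, p_y = {}, {}
    for (x, y), p in joint.items():
        p_x[x] = p_x.get(x, 0.0) + p
        p_y[y] = p_y.get(y, 0.0) + p
    return entropy(p_x.values()) + entropy(p_y.values()) - entropy(joint.values())


# A product distribution: two independent uniform bits.
uniform_product = {(x, y): 0.25 for x in (0, 1) for y in (0, 1)}

# A maximally correlated distribution: Y is a copy of a uniform bit X.
perfectly_correlated = {(0, 0): 0.5, (1, 1): 0.5}
```

In line with Fact 2, `mutual_information(uniform_product)` evaluates to zero, while the perfectly correlated pair of bits has one bit of mutual information.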

The definitions and facts stated in the above paragraph for classical random variables also hold
mutatis mutandis for quantum states. For example, for a quantum state $\rho$, its entropy is defined
as $S(\rho) := -\mathrm{Tr}\, \rho \log \rho$. For brevity, we avoid making all the corresponding statements explicitly. As is
the case with classical random variables, for a quantum system, say $Q$, we also often let $Q$ represent
its quantum state. We have the following fact.

Fact 3 Any quantum state $\rho$ in $m$ qubits has $S(\rho) \le m$. Also let $XQ$ be a joint classical-quantum
system with $X$ being a classical random variable; then $I(X:Q) \le \min\{S(X), S(Q)\}$.

For a system $XYM$, let us define $I(X:M|Y) := S(X|Y) + S(M|Y) - S(XM|Y)$. If $Y$ is a
classical system then it is easily seen that $I(X:M|Y) = \mathbb{E}_{y \leftarrow Y}[I(X:M|(Y = y))]$.

For random variables $X_1, \ldots, X_n$ and a correlated (possibly quantum) system $M$, we have the
following chain rule of mutual information, which will be crucially used in our proofs.

$$I(X_1 \ldots X_n : M) = \sum_{i=1}^{n} I(X_i : M | X_1 \ldots X_{i-1})$$ (6)

By convention, conditioning on $X_1 \ldots X_{i-1}$ for $i = 1$ means conditioning on the true event.

The following is an important information theoretic fact known as Fano's inequality, which relates 
the probability of disagreement for correlated random variables to their mutual information. 



Lemma 1 (Fano's inequality). Let $X$ be a random variable taking values in $\mathcal{X}$. Let $Y$ be a
correlated random variable and let $P_e := \Pr(X \ne Y)$. Then,

$$S(P_e) + P_e \log(|\mathcal{X}| - 1) \ge S(X|Y).$$
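To make the two sides of Fano's inequality concrete, the following Python sketch computes them for a finite joint distribution. The dictionary encoding of the joint distribution and the example channel mentioned afterwards are assumptions made for illustration; the sketch presumes an alphabet of size at least 2.

```python
import math


def entropy(probs) -> float:
    """Shannon entropy in bits of an iterable of probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def fano_sides(joint, alphabet_size: int):
    """Return (lhs, rhs) of Fano's inequality for joint[(x, y)] = Pr[X = x, Y = y].

    lhs = S(P_e) + P_e * log(|X| - 1) and rhs = S(X|Y) = S(XY) - S(Y).
    Assumes alphabet_size >= 2.
    """
    p_err = sum(p for (x, y), p in joint.items() if x != y)
    p_y = {}
    for (_, y), p in joint.items():
        p_y[y] = p_y.get(y, 0.0) + p
    cond_entropy = entropy(joint.values()) - entropy(p_y.values())
    lhs = entropy([p_err, 1.0 - p_err]) + p_err * math.log2(alphabet_size - 1)
    return lhs, cond_entropy
```

For instance, if $X$ is uniform on $\{0, 1, 2\}$ and $Y$ equals $X$ with probability 0.8 and each of the two other values with probability 0.1, the two sides coincide up to rounding, which is the case where Fano's inequality is tight.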



The VC-dimension of a boolean function / is an important combinatorial concept and has close 
connections with the one-way communication complexity of /. 

Definition 1 (Vapnik-Chervonenkis (VC) dimension). A set $S \subseteq \mathcal{Y}$ is said to be shattered by
a set $\mathcal{G}$ of boolean functions from $\mathcal{Y}$ to $\{0, 1\}$ if $\forall R \subseteq S$, $\exists g_R \in \mathcal{G}$ such that $\forall s \in S$, $(s \in R) \Leftrightarrow
(g_R(s) = 1)$. The largest value $d$ for which there is a set $S$ of size $d$ that is shattered by $\mathcal{G}$ is the
Vapnik-Chervonenkis dimension of $\mathcal{G}$ and is denoted by $\mathrm{VC}(\mathcal{G})$.

Let $f : \mathcal{X} \times \mathcal{Y} \to \{0, 1\}$ be a boolean function. For all $x \in \mathcal{X}$ let $f_x : \mathcal{Y} \to \{0, 1\}$ be defined as
$f_x(y) := f(x, y)$, $\forall y \in \mathcal{Y}$. Let $\mathcal{F} := \{f_x : x \in \mathcal{X}\}$. Then the Vapnik-Chervonenkis dimension of $f$,
denoted by $\mathrm{VC}(f)$, is defined to be $\mathrm{VC}(\mathcal{F})$.

Let $f$ and $\mathcal{F}$ be as defined in the above definition. We call a function $f$ trivial iff $|\mathcal{F}| = 1$, in
other words iff the value of the function, for all $x$, is determined only by $y$. We call $f$ non-trivial iff
it is not trivial. Note that a boolean $f$ is non-trivial if and only if $\mathrm{VC}(f) \ge 1$. Throughout this paper
we assume all our functions to be non-trivial.
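Definition 1 can be checked by brute force on small functions. The sketch below is exponential-time and purely illustrative: it represents $f$ by its communication matrix (one row $f_x$ per input $x$, an assumed encoding) and searches for the largest shattered set of columns. The helper names are hypothetical.

```python
from itertools import combinations


def vc_dimension(rows, num_cols: int) -> int:
    """Brute-force VC-dimension of the row set {f_x} of a 0/1 matrix.

    rows[x][y] = f(x, y). A column subset S is shattered when the rows,
    restricted to S, realise all 2^|S| patterns. Since every subset of a
    shattered set is shattered, the search can stop at the first size d
    for which no set of size d is shattered.
    """
    best = 0
    for d in range(1, num_cols + 1):
        shattered = any(
            len({tuple(row[s] for s in subset) for row in rows}) == 2 ** d
            for subset in combinations(range(num_cols), d)
        )
        if not shattered:
            break
        best = d
    return best


def greater_than_rows(n: int):
    """Communication matrix of GT_n: f(x, y) = 1 iff x > y on n-bit integers."""
    size = 2 ** n
    return [tuple(int(x > y) for y in range(size)) for x in range(size)]


def inner_product_rows(n: int):
    """Communication matrix of IP_n: inner product mod 2 of n-bit strings."""
    size = 2 ** n
    return [tuple(bin(x & y).count("1") % 2 for y in range(size)) for x in range(size)]
```

On 3-bit inputs this gives VC-dimension 1 for Greater Than, matching the value of $\mathrm{VC}(\mathrm{GT}_n)$ quoted in Sec. 1.1, and VC-dimension 3 for Inner Product.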

The following is a useful fact, with several applications, relating the VC-dimension of $f$ to the size of
$\mathcal{F}$. It is usually attributed to Sauer [Sau72]; however, it has been independently discovered by several
different people as well.

Lemma 2 (Sauer's Lemma [Sau72]). Let $f : \mathcal{X} \times \mathcal{Y} \to \{0, 1\}$ be a boolean function. Let $d =
\mathrm{VC}(f)$. Let $m = |\mathcal{Y}|$, then

$$|\mathcal{F}| \le m^{d}.$$



The following result from Blumer, Ehrenfeucht, Haussler, and Warmuth [BEHW89] is one of the 
most fundamental results from computational learning theory and in fact an important application 
of Sauer's Lemma. 

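Lemma 3. Let $H$ be a class of boolean functions over a finite domain $\mathcal{Y}$ with VC-dimension $d$, let $\pi$
be an arbitrary probability distribution over $\mathcal{Y}$, and let $0 < \epsilon, \delta < 1$. Let $L$ be any algorithm that takes
as input a set $S \in \mathcal{Y}^m$ of $m$ examples labeled according to an unknown function $h \in H$, and outputs
a hypothesis function $h' \in H$ that is consistent with $h$ on the sample $S$. If $L$ receives a random
sample of size $m \ge m_0(d, \epsilon, \delta)$ distributed according to $\pi^m$, where

$$m_0(d, \epsilon, \delta) = c_0 \left( \frac{1}{\epsilon} \log \frac{1}{\delta} + \frac{d}{\epsilon} \log \frac{1}{\epsilon} \right)$$

for some constant $c_0 > 0$, then with probability at least $1 - \delta$ over the random samples, $\Pr_{\pi}[h'(y) \ne
h(y)] \le \epsilon$.

A similar learning result also holds for non-boolean functions. For this let us first define the
following generalization of the VC-dimension, known as the pseudo-dimension.

Definition 2 (pseudo-dimension). A set $S \subseteq \mathcal{Y}$ is said to be $\gamma$-shattered by a set $\mathcal{G}$ of functions
from $\mathcal{Y}$ to $\mathcal{Z} \subseteq \mathbb{R}$ if there exists a vector $w = (w_1, \ldots, w_k)$ of dimension $k = |S|$ for which
the following holds. For all $R \subseteq S$, $\exists g_R \in \mathcal{G}$ such that $\forall s \in S$, $(s \in R) \Rightarrow (g_R(s) > w_i + \gamma)$ and
$(s \notin R) \Rightarrow (g_R(s) < w_i - \gamma)$, where $w_i$ is the coordinate of $w$ corresponding to $s$. The largest value $d$ for which there is a set $S$ of size $d$ that is
$\gamma$-shattered by $\mathcal{G}$ is the $\gamma$-pseudo-dimension of $\mathcal{G}$ and is denoted by $\mathcal{P}_{\gamma}(\mathcal{G})$.

Let $f : \mathcal{X} \times \mathcal{Y} \to \mathcal{Z}$ be a function. For all $x \in \mathcal{X}$ let $f_x : \mathcal{Y} \to \mathcal{Z}$ be defined as $f_x(y) := f(x, y)$, $\forall y \in
\mathcal{Y}$. Let $\mathcal{F} := \{f_x : x \in \mathcal{X}\}$. Then the $\gamma$-pseudo-dimension of $f$, denoted by $\mathcal{P}_{\gamma}(f)$, is defined to be
$\mathcal{P}_{\gamma}(\mathcal{F})$.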

The following result of Bartlett, Long and Williamson [BLW96] is similar to the learning lemma of
Blumer et al. [BEHW89] and concerns non-boolean functions.

Theorem 7. Let $\mathcal{G}$ be a class of functions over a finite domain $\mathcal{Y}$ into the range $[0, 1]$. Let $\pi$ be an
arbitrary probability distribution over $\mathcal{Y}$ and let $\epsilon \in (0, 1/2)$ and $\delta \in (0, 1)$. Let $d := \mathcal{P}_{\epsilon^2/576}(\mathcal{G})$. Then
there exists a deterministic learning algorithm $L$ which has the following property. Given as input a
set $S \in \mathcal{Y}^m$ of $m$ examples chosen according to $\pi^m$ and labeled according to an unknown function
$g \in \mathcal{G}$, $L$ outputs a hypothesis $g' \in \mathcal{G}$ such that if $m \ge m_0(d, \epsilon, \delta)$ where

mo{d,e,S) = cq log ^ + ^ log^ 

for some constant $c_0 > 0$, then with probability at least $1 - \delta$ over the random samples,

$$\sum_{y \in \mathcal{Y}} \pi(y) \cdot |g'(y) - g(y)| \le \epsilon.$$

The following is a fundamental quantum information theoretic fact shown by Holevo [Hol73].

Theorem 8 (The Holevo bound [Hol73]). Let $X$ be a classical random variable taking values in
$\mathcal{X}$. Let $M$ be a correlated quantum system and let $Y$ be a random variable obtained by performing a
quantum measurement on $M$. Then,

$$I(X:Y) \le I(X:M).$$ (7)

The following is an interesting and useful information theoretic fact first shown by Helstrom [Hel76].

Theorem 9 ([Hel76]). Let $XQ$ be a joint classical-quantum system where $X$ is a classical boolean
random variable. For $a \in \{0, 1\}$, let the quantum state of $Q$ when $X = a$ be $\rho_a$. The optimal success
probability of predicting $X$ with a measurement on $Q$ is given by

$$\frac{1}{2} + \frac{1}{2} \cdot \big\| \Pr[X = 0] \rho_0 - \Pr[X = 1] \rho_1 \big\|_1.$$



2.2 One-way communication 

In this article we only consider the two-party one-way model of communication. Let $f \subseteq \mathcal{X} \times \mathcal{Y} \times \mathcal{Z}$
be a relation. The relations we consider are always total in the sense that for every $(x, y) \in \mathcal{X} \times \mathcal{Y}$,
there is at least one $z \in \mathcal{Z}$ such that $(x, y, z) \in f$. In a one-way protocol $\mathcal{P}$ for computing $f$, Alice and
Bob get inputs $x \in \mathcal{X}$ and $y \in \mathcal{Y}$ respectively. Alice sends a single message to Bob, and their intention
is to determine an answer $z \in \mathcal{Z}$ such that $(x, y, z) \in f$. In the one-way protocols we consider, the
single message is always from Alice to Bob. A total function $f : \mathcal{X} \times \mathcal{Y} \to \mathcal{Z}$ can be viewed as a
special type of relation in which for every $(x, y)$ there is a unique $z$ such that $(x, y, z) \in f$. A partial
function is a special type of relation such that for some inputs $(x, y)$ there is a unique $z$ such that
$(x, y, z) \in f$, and for all other inputs $(x, y)$, $(x, y, z) \in f$, $\forall z \in \mathcal{Z}$. We view a partial function $f$ as a
function $f : \mathcal{X} \times \mathcal{Y} \to \mathcal{Z} \cup \{*\}$ such that the inputs $(x, y)$ for which $f(x, y) = *$ are exactly the ones
for which $(x, y, z) \in f$, $\forall z \in \mathcal{Z}$.

Let us first consider classical communication protocols. We let $D^{1}(f)$ represent the deterministic
one-way communication complexity, that is, the communication of the best deterministic protocol
computing $f$ correctly on all inputs. For $\epsilon \in (0, 1/2)$, let $\mu$ be a probability distribution on $\mathcal{X} \times \mathcal{Y}$.
We let $D^{1,\mu}_{\epsilon}(f)$ represent the distributional one-way communication complexity of $f$ under $\mu$ with
expected error $\epsilon$, i.e., the communication of the best private-coin one-way protocol for $f$ with
distributional error (average error over the coins and the inputs) at most $\epsilon$ under $\mu$. It is easily noted
that $D^{1,\mu}_{\epsilon}(f)$ is always achieved by a deterministic one-way protocol, and we will henceforth restrict
ourselves to deterministic protocols in the context of distributional communication complexity. We
let $R^{1,\mathrm{pub}}_{\epsilon}(f)$ represent the public-coin randomized one-way communication complexity of $f$ with
worst-case error $\epsilon$, i.e., the communication of the best public-coin randomized one-way protocol for
$f$ with error for each input $(x, y)$ being at most $\epsilon$. The analogous quantity for private-coin randomized
protocols is denoted by $R^{1}_{\epsilon}(f)$. The public- and private-coin randomized communication complexities
are not much different, as shown in Newman's result [New91] that

$$R^{1}(f) = O\big(R^{1,\mathrm{pub}}(f) + \log\log|\mathcal{X}| + \log\log|\mathcal{Y}|\big).$$ (8)

The following result due to Yao [Yao77] is a very useful fact connecting worst-case and distributional
communication complexities. It is a consequence of the min-max theorem in game theory [KN97,
Thm. 3.20, page 36].

Lemma 4 (Yao's principle [Yao77]). $R^{1,\mathrm{pub}}_{\epsilon}(f) = \max_{\mu} D^{1,\mu}_{\epsilon}(f)$.

We define $R^{1,[]}_{\epsilon}(f) := \max_{\mu \text{ product}} D^{1,\mu}_{\epsilon}(f)$. Note that $R^{1,[]}_{\epsilon}(f)$ could be significantly smaller than
$R^{1,\mathrm{pub}}_{\epsilon}(f)$, as is exhibited by the Greater Than ($\mathrm{GT}_n$) function, for which $R^{1,\mathrm{pub}}(\mathrm{GT}_n) = \Omega(n)$, whereas
$R^{1,[]}(\mathrm{GT}_n) = O(1)$.

In a one-way quantum communication protocol, Alice and Bob are allowed to do quantum
operations and Alice can send a quantum message (qubits) to Bob. Given $\epsilon \in (0, 1/2)$, the one-way
quantum communication complexity $Q^{1}_{\epsilon}(f)$ is defined to be the communication of the best one-way
quantum protocol with error at most $\epsilon$ on all inputs. Given a distribution $\mu$ on $\mathcal{X} \times \mathcal{Y}$, we can similarly
define the quantum distributional one-way communication complexity of $f$, denoted $Q^{1,\mu}_{\epsilon}(f)$, to be
the communication of the best one-way quantum protocol $\mathcal{P}$ for $f$ such that the average error of $\mathcal{P}$
over the inputs drawn from the distribution $\mu$ is at most $\epsilon$. We define $Q^{1,[]}_{\epsilon}(f) := \max_{\mu \text{ product}} Q^{1,\mu}_{\epsilon}(f)$.

3 A new upper bound on classical one-way distributional communication
complexity 

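In this section we present the upper bounds on the distributional communication complexity $D^{1,\mu}_{\epsilon}(f)$
for any distribution $\mu$ (possibly non-product) on $\mathcal{X} \times \mathcal{Y}$. We begin by restating the precise result for
boolean functions.

Theorem 10. Let $f : \mathcal{X} \times \mathcal{Y} \to \{0, 1\}$ be a boolean function and let $\epsilon \in (0, 1/2)$ be a constant. Let
$\mu$ be a distribution (possibly non-product) on $\mathcal{X} \times \mathcal{Y}$. Let $XY$ be joint random variables distributed
according to $\mu$. There is a universal constant $\kappa$ such that,

$$D^{1,\mu}_{\epsilon}(f) \le \kappa \cdot \frac{1}{\epsilon} \log \frac{1}{\epsilon} \cdot \left( \frac{1}{\epsilon} \cdot I(X:Y) + 1 \right) \cdot \mathrm{VC}(f).$$

In other words,

$$D^{1,\mu}_{\epsilon}(f) = O\big((I(X:Y) + 1) \cdot \mathrm{VC}(f)\big).$$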

For showing this result we will crucially use the following fact shown by Harsha, Jain, McAllester
and Radhakrishnan [HJMR07] concerning the communication required for generating correlations. We
begin with the following definition.

Definition 3 (Correlation protocol). Let $(X, Y)$ be a pair of correlated random variables taking
values in $\mathcal{X} \times \mathcal{Y}$. Let Alice be given $x \in \mathcal{X}$, sampled according to the distribution $X$. Alice should
transmit a message to Bob such that Alice and Bob can together generate a value $y \in \mathcal{Y}$ distributed
according to the conditional distribution $Y|_{X=x}$; that is, the pair $(x, y)$ should have joint distribution
$(X, Y)$. Alice and Bob are allowed to use public randomness. Note that the generated value $y$ should
be known to both Alice and Bob.

Harsha et al. [HJMR07] showed that the minimal expected number of bits that Alice needs to
send (in the presence of shared randomness), denoted $T^{R}(X:Y)$, is characterized by the mutual
information $I(X:Y)$ as follows.



Theorem 11 ([HJMR07]). There exists a universal positive constant $l$ such that,

$$I(X:Y) \le T^{R}(X:Y) \le 4 I(X:Y) + l.$$

We will also need the following fact. 

Lemma 5. Let $m \ge 1$ be an integer. Let $XY$ be correlated random variables. Let $\mu_x$ be the
distribution of $Y|X = x$. Let $X'Y'$ represent joint random variables such that $X'$ is distributed identically
to $X$ and the distribution of $Y'|(X' = x)$ is $\mu_x^{\otimes m}$ ($m$ independent copies of $\mu_x$). Then,

$$I(X':Y') \le m \cdot I(X:Y).$$

Proof. Consider,

$$I(X':Y') = S(Y') - \mathbb{E}_{x \leftarrow X'}[S(Y'|X' = x)]$$
$$= S(Y') - m \cdot \mathbb{E}_{x \leftarrow X}[S(Y|X = x)]$$
$$\le m \cdot S(Y) - m \cdot \mathbb{E}_{x \leftarrow X}[S(Y|X = x)]$$
$$= m \cdot I(X:Y).$$

The second equality above follows from Fact 2 and since $X'$ and $X$ are identically distributed.
Similarly, the first inequality above follows from Fact 2 by noting that each of the $m$ coordinates of $Y'$ is distributed as $Y$.
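Lemma 5 can be checked numerically on a small example. The sketch below builds the joint distribution of $X'Y'$ explicitly (feasible only for tiny alphabets and small $m$) and compares $I(X':Y')$ with $m \cdot I(X:Y)$. The dictionary encoding, the function names and the example distribution in the usage note are illustrative assumptions.

```python
import math
from itertools import product


def entropy(probs) -> float:
    """Shannon entropy in bits of an iterable of probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def mutual_information(joint) -> float:
    """I(X:Y) = S(X) + S(Y) - S(XY) for joint[(x, y)] = Pr[X = x, Y = y]."""
    p_x, p_y = {}, {}
    for (x, y), p in joint.items():
        p_x[x] = p_x.get(x, 0.0) + p
        p_y[y] = p_y.get(y, 0.0) + p
    return entropy(p_x.values()) + entropy(p_y.values()) - entropy(joint.values())


def m_copy_joint(joint, m: int):
    """Joint distribution of (X', Y'): X' ~ X and Y' | X' = x is m i.i.d. draws from mu_x."""
    p_x = {}
    for (x, _), p in joint.items():
        p_x[x] = p_x.get(x, 0.0) + p
    ys = sorted({y for (_, y) in joint})
    out = {}
    for x, px in p_x.items():
        if px == 0:
            continue  # inputs of probability zero contribute nothing
        for y_tuple in product(ys, repeat=m):
            p = px
            for y in y_tuple:
                p *= joint.get((x, y), 0.0) / px  # conditional probability mu_x(y)
            if p > 0:
                out[(x, y_tuple)] = p
    return out
```

For a noisy copy of a uniform bit, for example `{(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}`, the value `mutual_information(m_copy_joint(joint, 3))` stays below three times `mutual_information(joint)`, as the lemma predicts.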

We are now ready for the proof of Thm. 10. 
Proof of Thm. 10: Let m = mo(VC(/), e/4, e/4) = cq • log ■ (VC(/) + 1) as in Lem. 3. Let 

$l$ be the constant as in Thm. 11. Let $c := 4m \cdot I(X:Y) + l$. We exhibit a public coin protocol $\mathcal{P}$ with
inputs drawn from $\mu$, in which Alice sends two messages $M_1$ and $M_2$ to Bob. The expected length of
$M_1$ is at most $c$ and the length of $M_2$ is always at most $m$. The average error (over inputs and coins)
of $\mathcal{P}$ is at most $\epsilon/2$. Let $\mathcal{P}'$ be the protocol that simulates $\mathcal{P}$ but aborts and outputs 0 whenever the
length of $M_1$ in $\mathcal{P}$ exceeds $2c/\epsilon$. By Markov's inequality this happens with probability at most
$\epsilon/2$. Hence the expected error of $\mathcal{P}'$ is at most $\epsilon/2 + \epsilon/2 = \epsilon$. From $\mathcal{P}'$, we finally get a deterministic
protocol with communication bounded by $2c/\epsilon + m$ and distributional error at most $\epsilon$. This implies
our result from the definition of $D^{1,\mu}_{\epsilon}(f)$ and by setting $\kappa$ appropriately.

For $x \in \mathcal{X}$, let $\mu_x$ be the distribution of $Y|X=x$. In $\mathcal{P}$, on receiving the input $x \in \mathcal{X}$,
Alice first sends a message $M_1$ to Bob, according to the corresponding correlation protocol as in
Definition 3, and they together sample from the distribution $\mu_x^{\otimes m}$. Let $y_1, \ldots, y_m$ be the
samples generated. Note that from the properties of the correlation protocol both Alice and Bob know
the values of $y_1, \ldots, y_m$. Alice then sends to Bob the second message $M_2$, which consists of the values
$f(x,y_1), \ldots, f(x,y_m)$. Bob then considers the first $x'$ (according to the increasing order) such that
$\forall i \in [m]$, $f(x',y_i) = f(x,y_i)$, and outputs $f(x',y)$, where $y$ is his actual input. Using Lem. 3, it is
easy to verify that for every $x \in \mathcal{X}$, the average error (over randomness in the protocol and inputs
of Bob) in this protocol $\mathcal{P}$ is at most $\epsilon/2$. Hence the overall average error of $\mathcal{P}$ is also at most
$\epsilon/2$. Also, from Thm. 11 and Lem. 5, we can verify that the expected length of $M_1$ in $\mathcal{P}$ is at
most $4m \cdot I(X:Y) + l$. □
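The second phase of the protocol (Alice's labels on the shared samples, followed by Bob's choice of the first consistent $x'$) can be illustrated with a small simulation. This is only a sketch under stated assumptions: the shared samples are drawn directly from $\mu_x$ instead of being generated by the correlation protocol of Definition 3, and the function `threshold`, the domain sizes and all helper names are hypothetical.

```python
import random


def threshold(x, y):
    """Hypothetical toy function f(x, y) = [y >= x]; each row f(x, .) is a threshold."""
    return int(y >= x)


def bob_output(f, xs, x, samples, y):
    """Bob picks the first x' that agrees with Alice's labels and outputs f(x', y)."""
    labels = [f(x, yi) for yi in samples]            # message M_2 sent by Alice
    for x_prime in xs:                               # xs is scanned in increasing order
        if all(f(x_prime, yi) == b for yi, b in zip(samples, labels)):
            return f(x_prime, y)
    raise AssertionError("x itself is always consistent, so this is unreachable")


def estimate_error(f, xs, mu_x, x, m, trials, rng):
    """Empirical error for a fixed x; mu_x is a list of (y, prob) pairs for Y | X = x."""
    ys, ps = zip(*mu_x)
    wrong = 0
    for _ in range(trials):
        samples = rng.choices(ys, weights=ps, k=m)   # stands in for the shared samples
        y = rng.choices(ys, weights=ps, k=1)[0]      # Bob's actual input
        wrong += bob_output(f, xs, x, samples, y) != f(x, y)
    return wrong / trials


xs = list(range(17))
uniform_y = [(y, 1 / 16) for y in range(16)]
for m in (1, 4, 16, 64):
    print(m, estimate_error(threshold, xs, uniform_y, 5, m, 2000, random.Random(0)))
```

In this toy setting the empirical error shrinks as $m$ grows, which is the qualitative behaviour that Lem. 3 quantifies.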

Following similar arguments and using Thm. 7 and Thm. 11, we obtain a similar result for
non-boolean functions as follows.

Theorem 12. Let $k \ge 2$ be an integer. Let $f : \mathcal{X} \times \mathcal{Y} \to [k]$ be a non-boolean function and let
$\epsilon \in (0, 1/6)$ be a constant. Let $f' : \mathcal{X} \times \mathcal{Y} \to [0,1]$ be such that $f'(x,y) = f(x,y)/k$. Let $\mu$ be a distribution
(possibly non-product) on $\mathcal{X} \times \mathcal{Y}$. Let $XY$ be joint random variables distributed according to $\mu$. There
is a universal constant $\kappa$ such that,

1,4/1 JZ,\ 

Dlfif) < '^■^■(^og-+d log' — j • (7(X : y ) + log fc) 
where d^= P ^2 (f) is the -^!f^ -pseudo-dimension of f. 



Proof. Let m = mo(d, e/k, e) = cq log \ + ^ log^ as in Thm. 7. Let I be the constant as in 

Thm. 11. Let $c := 4m \cdot I(X:Y) + l$. We exhibit a public coin protocol $\mathcal{P}$ for $f$, with inputs drawn
from $\mu$, in which Alice sends two messages $M_1$ and $M_2$ to Bob. The expected length of $M_1$ is at most
$c$ and the length of $M_2$ is always at most $O(m \log k)$. The average error (over inputs and coins) of $\mathcal{P}$
is at most $2\epsilon$. Let $\mathcal{P}'$ be the protocol that simulates $\mathcal{P}$ but aborts and outputs 0 whenever the length
of $M_1$ in $\mathcal{P}$ exceeds $c/\epsilon$. By Markov's inequality this happens with probability at most $\epsilon$. Hence
the expected error of $\mathcal{P}'$ is at most $2\epsilon + \epsilon = 3\epsilon$. From $\mathcal{P}'$, we finally get a deterministic protocol with
communication bounded by $c/\epsilon + O(m \log k)$ and distributional error at most $3\epsilon$. This implies our
result from the definition of $D^{1,\mu}_{3\epsilon}(f)$ and by setting $\kappa$ appropriately.

In $\mathcal{P}$, Alice and Bob intend to first determine $f'(x,y)$ and then output $k f'(x,y)$. For $x \in \mathcal{X}$, let
$\mu_x$ be the distribution of $Y|X=x$. On receiving the input $x \in \mathcal{X}$, Alice first sends a message $M_1$
to Bob, according to the corresponding correlation protocol as in Definition 3, and they together
sample from the distribution $\mu_x^{\otimes m}$. Let $y_1, \ldots, y_m$ be the samples generated. Alice then sends to
Bob the second message $M_2$, which consists of the values $f'(x,y_1), \ldots, f'(x,y_m)$. Bob then considers $x'$ as
obtained from the learning algorithm $L$ (as in Thm. 7) and then outputs $k f'(x',y)$, where $y$ is his
actual input. Therefore from Thm. 7, with probability at least $1 - \epsilon$ over the samples $y_1, \ldots, y_m$,

$$\sum_{y \in \mathcal{Y}} \mu_x(y) \cdot |f'(x',y) - f'(x,y)| \;\le\; \epsilon/k. \qquad (9)$$

Note that $(f'(x',y) \ne f'(x,y)) \Rightarrow |f'(x',y) - f'(x,y)| \ge 1/k$. Hence for samples $y_1, \ldots, y_m$ for
which (9) holds, using Markov's inequality, we have $\Pr_{y \leftarrow \mu_x}[f'(x',y) \ne f'(x,y)] \le \epsilon$. Therefore, for
any fixed $x$, the error of $\mathcal{P}$ is at most $2\epsilon$ and hence the overall error of $\mathcal{P}$ is also at most $2\epsilon$.

From Thm. 11 and Lem. 5, we can verify that the expected length of $M_1$ in $\mathcal{P}$ is at most
$4m \cdot I(X:Y) + l$. The length of $M_2$ is at most $O(m \log k)$, since using a prefix-free encoding each
$f'(x,y_i)$ can be specified in $O(\log k)$ bits.

4 A new lower bound on quantum one-way distributional communication 
complexity 

In this section we present our lower bound on the quantum one-way distributional communication
complexity of a function $f$, in terms of the one-way rectangle bound of $f$. We begin with a few
definitions leading to the definition of the one-way rectangle bound.

Definition 4 (Rectangle). A one-way rectangle $R$ is a set $S \times \mathcal{Y}$, where $S \subseteq \mathcal{X}$. For a distribution
$\mu$ over $\mathcal{X} \times \mathcal{Y}$, let $\mu_R$ represent the distribution arising from $\mu$ conditioned on the event $R$ and let
$\mu(R)$ represent the probability (under $\mu$) of the event $R$.

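Definition 5 (One-way $\epsilon$-monochromatic). Let $f \subseteq \mathcal{X} \times \mathcal{Y} \times \mathcal{Z}$ be a relation. We call a
distribution $\lambda$ on $\mathcal{X} \times \mathcal{Y}$ one-way $\epsilon$-monochromatic for $f$ if there is a function $g : \mathcal{Y} \to \mathcal{Z}$ such that
$\Pr_{XY \leftarrow \lambda}[(X, Y, g(Y)) \in f] \ge 1 - \epsilon$.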

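Definition 6 (Rectangle bound). Let $f \subseteq \mathcal{X} \times \mathcal{Y} \times \mathcal{Z}$ be a relation. For a distribution $\mu$ on $\mathcal{X} \times \mathcal{Y}$,
the one-way rectangle bound is defined as:

$$\mathrm{rec}^{1,\mu}_{\epsilon}(f) := \min\left\{\log_2 \frac{1}{\mu(R)} \;:\; R \text{ is a one-way rectangle and } \mu_R \text{ is one-way } \epsilon\text{-monochromatic}\right\}.$$

The one-way rectangle bound for $f$ is defined as:

$$\mathrm{rec}^{1}_{\epsilon}(f) := \max_{\mu} \mathrm{rec}^{1,\mu}_{\epsilon}(f).$$

We also define,

$$\mathrm{rec}^{1,[]}_{\epsilon}(f) := \max_{\mu:\,\mathrm{product}} \mathrm{rec}^{1,\mu}_{\epsilon}(f).$$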
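For very small domains, Definitions 4 to 6 can be evaluated directly by enumerating all sets $S \subseteq \mathcal{X}$. The sketch below does this for a total function; it relies on the observation that the best $g$ in Definition 5 assigns to each $y$ the value of $f(\cdot, y)$ carrying the most mass inside $S$. The function name `rectangle_bound` and the equality example are hypothetical illustrations, not constructions from the paper.

```python
import itertools
import math


def rectangle_bound(f, xs, ys, mu, eps):
    """Brute-force rec^{1,mu}_eps(f) for a total function f on a tiny domain.

    mu maps (x, y) to a probability. Returns math.inf if no rectangle qualifies.
    """
    best = math.inf
    for size in range(1, len(xs) + 1):
        for S in itertools.combinations(xs, size):
            mass = sum(mu[(x, y)] for x in S for y in ys)   # mu(R) for R = S x Y
            if mass <= 0:
                continue
            # Best g: for each y, choose the output value with the largest mass in S.
            agree = 0.0
            for y in ys:
                weight = {}
                for x in S:
                    z = f(x, y)
                    weight[z] = weight.get(z, 0.0) + mu[(x, y)]
                agree += max(weight.values())
            if agree / mass >= 1 - eps:                     # mu_R is eps-monochromatic
                best = min(best, math.log2(1 / mass))
    return best


def equality(x, y):
    """Hypothetical example: equality on a 4-element domain."""
    return int(x == y)


dom = [0, 1, 2, 3]
uniform = {(x, y): 1 / 16 for x in dom for y in dom}
print(rectangle_bound(equality, dom, dom, uniform, 0.0))   # 2.0: only singletons S qualify
print(rectangle_bound(equality, dom, dom, uniform, 0.3))   # 0.0: g = 0 agrees with prob. 3/4
```

The search is exponential in $|\mathcal{X}|$, so the sketch is useful only for checking intuition on toy examples.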



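We restate our precise result here, followed by its proof.

Theorem 13. Let $f : \mathcal{X} \times \mathcal{Y} \to \mathcal{Z}$ be a total function and let $\epsilon \in (0, 1/2)$ be a constant. Let $\mu$ be a
product distribution on $\mathcal{X} \times \mathcal{Y}$ and let $\mathrm{rec}^{1,\mu}_{\epsilon}(f) > 2\log(1/\epsilon)$. Then,

$$Q^{1,\mu}_{\epsilon^3/8}(f) \;\ge\; \frac{1}{2} \cdot (1 - 2\epsilon) \cdot \big(S(\epsilon/2) - S(\epsilon/4)\big) \cdot \big(\lfloor \mathrm{rec}^{1,\mu}_{\epsilon}(f) \rfloor - 1\big).$$

If $f : \mathcal{X} \times \mathcal{Y} \to \mathcal{Z} \cup \{*\}$ is a partial function then,

$$Q^{1,\mu}_{\epsilon^6/(2 \cdot 15^4)}(f) \;\ge\; \frac{1}{2} \cdot (1 - 2\epsilon) \cdot \frac{\epsilon^2}{300} \cdot \big(\lfloor \mathrm{rec}^{1,\mu}_{\epsilon}(f) \rfloor - 1\big).$$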

We begin with the following information theoretic fact. 

Lemma 6. Let $0 \le d \le c \le 1/2$. Let $Z$ be a binary random variable with
$\min\{\Pr(Z = 0), \Pr(Z = 1)\} \ge c$. Let $M$ be a correlated quantum system. Let $Z'$ be a classical boolean random variable
obtained by performing a measurement on $M$ such that $\Pr(Z \ne Z') \le d$. Then

$$I(Z:M) \;\ge\; I(Z:Z') \;\ge\; S(c) - S(d).$$

Proof. The first inequality follows from the Holevo bound, Thm. 8. For the second inequality we
note that $S(Z) \ge S(c)$ (since the binary entropy function is monotonically increasing in $(0,1/2]$)
and from Fano's inequality, Lem. 1, we have $S(Z|Z') \le S(d)$. Therefore,

$$I(Z:Z') = S(Z) - S(Z|Z') \;\ge\; S(c) - S(d).$$
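The classical part of Lem. 6 can be checked numerically in a simple special case. The sketch below assumes that $Z'$ is obtained from $Z$ by an independent bit flip with probability exactly $d$ (a binary symmetric channel); this noise model and the function names are assumptions made for illustration and are not part of the lemma.

```python
import math


def binary_entropy(p):
    """Binary entropy S(p) in bits, with S(0) = S(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def mutual_information_bsc(p, d):
    """I(Z:Z') when Pr[Z = 1] = p and Z' flips Z independently with probability d."""
    q = p * (1 - d) + (1 - p) * d        # Pr[Z' = 1]
    return binary_entropy(q) - binary_entropy(d)


# Compare I(Z:Z') with the lower bound S(c) - S(d) for a few parameter choices.
for c, d, p in [(0.25, 0.05, 0.25), (0.25, 0.05, 0.5), (0.4, 0.1, 0.6)]:
    print(c, d, p, mutual_information_bsc(p, d), binary_entropy(c) - binary_entropy(d))
```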

We are now ready for the proof of Thm. 13. 

Proof of Thm. 13: 

For total boolean functions: For simplicity of the explanation, we first present the proof assuming
$f$ to be a total boolean function. Let $r := \lfloor \mathrm{rec}^{1,\mu}_{\epsilon}(f) \rfloor$ or $\lfloor \mathrm{rec}^{1,\mu}_{\epsilon}(f) \rfloor - 1$, so as to make $r$ even. Let
$\mathcal{P}$ be the optimal one-way quantum protocol for $f$ with distributional error under $\mu$ at most $\epsilon^3/4$.
(Although we have made a stronger assumption regarding the error in the statement of the Theorem,
we do not need it here and will only need it later while handling non-boolean functions.) Let $M$
represent the $m := Q^{1,\mu}_{\epsilon^3/4}(f)$ qubit quantum message of Alice in $\mathcal{P}$. Let $XY$ be the random variables
corresponding to Alice's and Bob's inputs, jointly distributed according to $\mu$. Our intention is to define
binary random variables $T_1, \ldots, T_{r/2}$ such that they are determined by $X$ (and hence a specific value
for $T_1, \ldots, T_{r/2}$ would correspond to a subset of $\mathcal{X}$) and $\forall i \in \{0, \ldots, \frac{r}{2} - 1\}$,

$$I(M : T_{i+1} \mid T_1 \ldots T_i) \;\ge\; (1 - 2\epsilon) \cdot \big(S(\epsilon/2) - S(\epsilon/4)\big).$$

Therefore from Fact 3 and the chain rule of mutual information, Eq. (6), we have,

$$\begin{aligned}
m \;\ge\; S(M) &\ge I(M : T_1 \ldots T_{r/2}) \\
&= \sum_{i=0}^{r/2-1} I(M : T_{i+1} \mid T_1 \ldots T_i) \\
&\ge (1 - 2\epsilon) \cdot \big(S(\epsilon/2) - S(\epsilon/4)\big) \cdot \frac{r}{2}.
\end{aligned}$$

This completes our proof.

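We define $T_1, \ldots, T_{r/2}$ in an inductive fashion. For $i \in \{0, \ldots, \frac{r}{2} - 1\}$, assume that we have
defined $T_1, \ldots, T_i$ and we intend to define $T_{i+1}$. Let $\mathrm{GOOD}_1$ be the set of strings $t \in \{0,1\}^i$ such
that $\Pr(T_1, \ldots, T_i = t) > 2^{-r}$. Then,

$$\Pr(T_1, \ldots, T_i \in \mathrm{GOOD}_1) \;\ge\; 1 - 2^{-r+i} \;\ge\; 1 - 2^{-r/2-1}.$$

Let $\epsilon_t$ be the error of the protocol $\mathcal{P}$ conditioned on $T_1, \ldots, T_i = t$. Note that $\mathbb{E}[\epsilon_t]$ is the same
as the overall expected error of $\mathcal{P}$; hence $\mathbb{E}[\epsilon_t] \le \epsilon^3/4$. Now using Markov's inequality we get a
set $\mathrm{GOOD}_2 \subseteq \{0,1\}^i$ such that $\Pr(T_1 \ldots T_i \in \mathrm{GOOD}_2) \ge 1 - \epsilon$ and $\forall t \in \mathrm{GOOD}_2$, $\epsilon_t \le \epsilon^2/4$. Let
$\mathrm{GOOD} := \mathrm{GOOD}_1 \cap \mathrm{GOOD}_2$. Therefore (since $r/2 \ge \log(1/\epsilon)$, from the hypothesis of the theorem),

$$\Pr(T_1 \ldots T_i \in \mathrm{GOOD}) \;\ge\; 1 - 2^{-r/2-1} - \epsilon \;\ge\; 1 - 2\epsilon. \qquad (10)$$

For $t \in \{0,1\}^i$ and $y \in \mathcal{Y}$, let

$$\delta_{t,y} := \min\{\Pr[f(X,y) = 0 \mid (T_1 \ldots T_i = t)],\; \Pr[f(X,y) = 1 \mid (T_1 \ldots T_i = t)]\}.$$

Also let $\epsilon_{t,y}$ be the expected error of $\mathcal{P}$ conditioned on $Y = y$ and $T_1 \ldots T_i = t$.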

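For $t \notin \mathrm{GOOD}$, we define $T_{i+1} \mid (T_1 \ldots T_i = t) := 0$. Let $t \in \mathrm{GOOD}$ from now on. Our intention
is to identify a $y_t \in \mathcal{Y}$ such that $\epsilon_{t,y_t} \le \epsilon/4$ and $\delta_{t,y_t} \ge \epsilon/2$. We will then let $T_{i+1} \mid (T_1 \ldots T_i = t)$
be $f(X, y_t) \mid (T_1 \ldots T_i = t)$. Lem. 6 will now imply $I(M : T_{i+1} \mid (T_1 \ldots T_i = t)) \ge S(\epsilon/2) - S(\epsilon/4)$.
Therefore,

$$\begin{aligned}
I(M : T_{i+1} \mid T_1 \ldots T_i) &\ge \sum_{t \in \mathrm{GOOD}} \Pr(T_1 \ldots T_i = t) \cdot I(M : T_{i+1} \mid (T_1 \ldots T_i = t)) \\
&\ge (1 - 2\epsilon) \cdot \big(S(\epsilon/2) - S(\epsilon/4)\big) \qquad \text{(using Eq. 10)}
\end{aligned}$$

and we would be done.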

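Now in order to identify the desired $y_t$, we proceed as follows. Since $r \le \mathrm{rec}^{1,\mu}_{\epsilon}(f)$, from the
definition of the rectangle bound and given that $\mu$ is a product distribution, we have the following. For
all $S \subseteq \mathcal{X}$ with $\mu(S \times \mathcal{Y}) > 2^{-r}$, or in other words with $\Pr[X \in S] > 2^{-r}$,

$$\mathbb{E}_{y \leftarrow Y}\big[\min\{\Pr[f(X,y) = 0 \mid X \in S],\; \Pr[f(X,y) = 1 \mid X \in S]\}\big] \;>\; \epsilon. \qquad (11)$$

Note that since $t \in \mathrm{GOOD}$, $\Pr[T_1 \ldots T_i = t] > 2^{-r}$. Hence (11) implies that $\mathbb{E}_{y \leftarrow Y}[\delta_{t,y}] > \epsilon$. Now
using Markov's inequality and the fact that $\forall (t,y)$, $\delta_{t,y} \le 1/2$, we get a set $\mathrm{GOOD}_t \subseteq \mathcal{Y}$ such that
$\Pr[Y \in \mathrm{GOOD}_t] \ge \epsilon$ and $\forall y \in \mathrm{GOOD}_t$, $\delta_{t,y} \ge \epsilon/2$.

Since $t \in \mathrm{GOOD}$, we have $\epsilon_t \le \epsilon^2/4$. Note that $\epsilon_t = \mathbb{E}_{y \leftarrow Y}[\epsilon_{t,y}]$. Using a Markov argument again
we finally get a $y_t \in \mathrm{GOOD}_t$ such that $\epsilon_{t,y_t} \le \epsilon/4$. Note that since $y_t \in \mathrm{GOOD}_t$, we have $\delta_{t,y_t} \ge \epsilon/2$
and we are done.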

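For total non-boolean functions: Let $f : \mathcal{X} \times \mathcal{Y} \to \mathcal{Z}$ be a total non-boolean function and
let $r$ be as before. We follow the same inductive argument as before to define $T_1 \ldots T_{r/2}$. For
$i \in \{0, \ldots, \frac{r}{2} - 1\}$, assume that we have defined $T_1 \ldots T_i$. As before we identify a set $\mathrm{GOOD} \subseteq \{0,1\}^i$
with $\Pr[T_1 \ldots T_i \in \mathrm{GOOD}] \ge 1 - 2\epsilon$, such that $\forall t \in \mathrm{GOOD}$, $\Pr[T_1 \ldots T_i = t] > 2^{-r}$ and $\epsilon_t \le \epsilon^2/8$.
Since $r \le \mathrm{rec}^{1,\mu}_{\epsilon}(f)$, from the definition of the rectangle bound and the fact that $\mu$ is product, we have,
$\forall S \subseteq \mathcal{X}$ with $\mu(S \times \mathcal{Y}) > 2^{-r}$,

$$\mathbb{E}_{y \leftarrow Y}\Big[\max_{z \in \mathcal{Z}}\{\Pr[f(X,y) = z \mid X \in S]\}\Big] \;<\; 1 - \epsilon. \qquad (12)$$

For $t \in \{0,1\}^i$ and $y \in \mathcal{Y}$, let $\epsilon_{t,y}$ be as before and let,

$$\delta_{t,y} := \max_{z \in \mathcal{Z}}\{\Pr[f(X,y) = z \mid (T_1 \ldots T_i = t)]\}.$$

For $t \notin \mathrm{GOOD}$, let us define $T_{i+1} \mid (T_1 \ldots T_i = t)$ to be 0. Let $t \in \mathrm{GOOD}$ from now on. Note
that (12) implies $\mathbb{E}_{y \leftarrow Y}[\delta_{t,y}] < 1 - \epsilon$. Using Markov's inequality we get a set $\mathrm{GOOD}_t \subseteq \mathcal{Y}$ with
$\Pr[Y \in \mathrm{GOOD}_t] \ge \epsilon/2$ and $\forall y \in \mathrm{GOOD}_t$, $\delta_{t,y} \le 1 - \epsilon/2$. Since $\mathbb{E}_{y \leftarrow Y}[\epsilon_{t,y}] = \epsilon_t \le \epsilon^2/8$, again using a
Markov argument we get a $y_t \in \mathrm{GOOD}_t$ such that $\epsilon_{t,y_t} \le \epsilon/4$. Since $\delta_{t,y_t} \le 1 - \epsilon/2$ (and $\epsilon \in (0, 1/2)$),
observe that there would exist a set $S_{t,y_t} \subseteq \mathcal{Z}$ such that,

$$\min\{\Pr[f(X,y_t) \in S_{t,y_t} \mid (T_1 \ldots T_i = t)],\; \Pr[f(X,y_t) \in \mathcal{Z} - S_{t,y_t} \mid (T_1 \ldots T_i = t)]\} \;\ge\; \epsilon/2.$$

Let us now define $T_{i+1} \mid (T_1 \ldots T_i = t)$ to be 1 if and only if $f(X, y_t) \in S_{t,y_t} \mid (T_1 \ldots T_i = t)$, and 0
otherwise. Note that since $\epsilon_{t,y_t} \le \epsilon/4$, conditioned on $T_1 \ldots T_i = t$, there exists a measurement on
$M$ that can predict the value of $T_{i+1}$ with success probability at least $1 - \epsilon/4$. The rest of the proof
follows as before.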

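For partial non-boolean functions: Let $f : \mathcal{X} \times \mathcal{Y} \to \mathcal{Z} \cup \{*\}$ be a partial function and let $r$
be as before. Let $i \in \{0, \ldots, \frac{r}{2} - 1\}$. We follow a similar inductive argument as in the case of total
non-boolean functions, except for the definition of $T_{i+1} \mid (T_1 \ldots T_i = t)$. As before we identify a set
$\mathrm{GOOD} \subseteq \{0,1\}^i$ with $\Pr[T_1 \ldots T_i \in \mathrm{GOOD}] \ge 1 - 2\epsilon$, such that $\forall t \in \mathrm{GOOD}$, $\Pr[T_1 \ldots T_i = t] > 2^{-r}$
and $\epsilon_t \le \epsilon^5/(2 \cdot 15^4)$. Since $r \le \mathrm{rec}^{1,\mu}_{\epsilon}(f)$, from the definition of the rectangle bound and the fact that $\mu$
is product, we have the following. For all $S \subseteq \mathcal{X}$ with $\mu(S \times \mathcal{Y}) > 2^{-r}$,

$$\mathbb{E}_{y \leftarrow Y}\Big[\max_{z \in \mathcal{Z}}\{\Pr[f(X,y) = (z \text{ or } *) \mid X \in S]\}\Big] \;<\; 1 - \epsilon. \qquad (13)$$

For $t \in \{0,1\}^i$ and $y \in \mathcal{Y}$, let $\epsilon_{t,y}$ be as before and let

$$\delta_{t,y} := \max_{z \in \mathcal{Z}}\{\Pr[f(X,y) = (z \text{ or } *) \mid (T_1 \ldots T_i = t)]\}.$$

For $t \notin \mathrm{GOOD}$, let us define $T_{i+1} \mid (T_1 \ldots T_i = t)$ to be 0. Let us assume $t \in \mathrm{GOOD}$ from now on. Let
$\mathrm{GOOD}_t \subseteq \mathcal{Y}$ be such that $\forall y \in \mathrm{GOOD}_t$, $\delta_{t,y} \le 1 - \epsilon/2$. Using Markov arguments as before we get
a $y_t \in \mathrm{GOOD}_t$ such that $\delta_{t,y_t} \le 1 - \epsilon/2$ and $\epsilon_{t,y_t} \le (\epsilon/15)^4 =: \epsilon'$. Since $\delta_{t,y_t} \le 1 - \epsilon/2$, it implies
$\Pr[f(X,y_t) = *] \le 1 - \epsilon/2$. Observe now that we can get a set $S_{t,y_t} \subseteq \mathcal{Z}$ such that,

$$\min\{\Pr[f(X,y_t) \in S_{t,y_t} \mid (T_1 \ldots T_i = t)],\; \Pr[f(X,y_t) \in \mathcal{Z} - S_{t,y_t} \mid (T_1 \ldots T_i = t)]\} \;\ge\; \epsilon/6. \qquad (14)$$

Let $O$ be the output of Bob when $Y = y_t$. All along the arguments below we condition on $T_1 \ldots T_i = t$.
Note that since Bob outputs some $z \in \mathcal{Z}$ even if $f(x,y) = *$, let us assume without loss of generality
that $q := \Pr[O \in S_{t,y_t}] \ge 1/2$ (otherwise similar arguments would hold by switching the roles of
$S_{t,y_t}$ and $\mathcal{Z} - S_{t,y_t}$). Let us define $T_{i+1}$ to be 1 if $f(X,y_t) \in S_{t,y_t} \cup \{*\}$ and 0 otherwise. Note that
Eq. (14) implies $\Pr[T_{i+1} = 1] \le 1 - \epsilon/6$. Now,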

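$$\begin{aligned}
q &= \Pr[O \in S_{t,y_t} \mid (T_{i+1} = 1)] \cdot \Pr[T_{i+1} = 1] + \Pr[O \in S_{t,y_t} \text{ and } T_{i+1} = 0] \\
&\le \Pr[O \in S_{t,y_t} \mid (T_{i+1} = 1)] \cdot \Pr[T_{i+1} = 1] + \epsilon' \\
&\le \Pr[O \in S_{t,y_t} \mid (T_{i+1} = 1)] \cdot (1 - \epsilon/6) + \epsilon'.
\end{aligned}$$

This implies,

$$\begin{aligned}
\Pr[O \in S_{t,y_t} \mid (T_{i+1} = 1)] &\ge \frac{q - \epsilon'}{1 - \epsilon/6} \\
&\ge (q - \epsilon')(1 + \epsilon/6) \\
&= q + q\epsilon/6 - \epsilon'(1 + \epsilon/6) \\
&\ge q + \epsilon/12 - \epsilon(1 + 1/12)/(2^3 \cdot 15^4) \qquad \text{(since } q \ge 1/2 \text{ and } \epsilon < 1/2\text{)} \\
&\ge q + 0.08\epsilon.
\end{aligned}$$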

Let us define $O' = 1$ iff $O \in S_{t,y_t}$ and $O' = 0$ otherwise. Then,

$$\begin{aligned}
I(M : T_{i+1}) &\ge I(O' : T_{i+1}) \\
&= S(O') - \Pr[T_{i+1} = 1] \cdot S(O' \mid (T_{i+1} = 1)) - \Pr[T_{i+1} = 0] \cdot S(O' \mid (T_{i+1} = 0)) \\
&\ge S(q) - S(q + 0.08\epsilon) - S(\epsilon') \\
&\ge 1 - S(0.5 + 0.08\epsilon) - S(\epsilon') \\
&\ge 1 - (1 - 2(0.08\epsilon)^2) - 2(\epsilon/15)^2 \\
&\ge \epsilon^2/300.
\end{aligned}$$

The third inequality above follows since the function $S(p)$ is concave and monotonically decreasing
in $[\frac{1}{2}, 1]$. The fourth inequality follows from Fact 1. The rest of the proof follows as before. □

5 Application: Security of boolean extractors against quantum 
adversaries 

In this section we present a consequence of our lower bound result, Thm. 13, to prove security of
extractors against quantum adversaries. In this section we are only concerned with boolean extractors. We
begin with the following definitions.

Definition 7 (Min-entropy). Let $P$ be a distribution on $[N]$. The min-entropy of $P$, denoted
$S_\infty(P)$, is defined to be $-\log \max_{i \in [N]} P(i)$.

Definition 8 (Strong extractor). Let $\epsilon \in (0, 1/2)$. Let $Y$ be uniformly distributed on $\mathcal{Y}$. A strong
$(k, \epsilon)$-extractor is a function $h : \mathcal{X} \times \mathcal{Y} \to \{0,1\}$ such that for any random variable $X$ distributed on
$\mathcal{X}$ which is independent of $Y$ and with $S_\infty(X) \ge k$ we have,

$$\|h(X,Y)Y - U \otimes Y\|_1 \;\le\; 2\epsilon,$$

where $U$ is the uniform distribution on $\{0,1\}$.

In other words, even given $Y$ (and not $X$), $h(X,Y)$ is still close (in $\ell_1$ distance) to being a
uniform bit.
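To make the quantity in Definition 8 concrete, the following sketch evaluates $\|h(X,Y)Y - U \otimes Y\|_1$ by enumeration when $X$ is uniform on a given set (a flat source) and $Y$ is a uniform seed. The choice of the inner product modulo 2 as $h$ is a hypothetical example made for illustration only; the paper does not single out any particular extractor.

```python
import itertools


def extractor_l1_distance(h, source, seeds):
    """|| h(X,Y)Y - U (x) Y ||_1 for X uniform on `source` and Y uniform on `seeds`."""
    total = 0.0
    for y in seeds:
        p1 = sum(h(x, y) for x in source) / len(source)   # Pr[h(X, y) = 1]
        # Contribution of the two outcomes (0, y) and (1, y), each compared with 1/2.
        total += (abs((1 - p1) - 0.5) + abs(p1 - 0.5)) / len(seeds)
    return total


def inner_product(x, y):
    """Hypothetical choice of h: inner product of two bit tuples modulo 2."""
    return sum(a & b for a, b in zip(x, y)) % 2


cube = list(itertools.product((0, 1), repeat=3))
print(extractor_l1_distance(inner_product, cube, cube))          # 0.125 (min-entropy 3)
print(extractor_l1_distance(inner_product, [(1, 1, 1)], cube))   # 1.0 (min-entropy 0)
```

With the full cube as the source, only the all-zero seed contributes to the distance; with a single-point source the output bit is a deterministic function of the seed.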

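Let $X, Y, h$ be as in the definition above. Let us consider a random variable $M$, taking values in
some set $\mathcal{M}$, correlated with $X$ and independent of $Y$. Let us now limit the correlation that $M$ has
with $X$, in the sense that $\forall m \in \mathcal{M}$, $S_\infty(X \mid M = m) \ge k$. Since $h$ is a strong $(k, \epsilon)$-extractor, it is
easy to verify that in such a case,

$$\begin{aligned}
&\forall m \in \mathcal{M}, \quad \|h(X,Y)Y \mid (M = m) - U \otimes Y \mid (M = m)\|_1 \le 2\epsilon \\
&\Rightarrow \quad \|h(X,Y)YM - U \otimes YM\|_1 \le 2\epsilon.
\end{aligned}$$

In other words, even given $M$ and $Y$, $h(X,Y)$ is still close (in $\ell_1$ distance) to being a uniform bit.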

Now let us ask what happens if the system $M$ is a quantum system. In that case, is it still true
that given $M$ and $Y$, $h(X,Y)$ is close to being a uniform bit? This question has been increasingly
studied in recent times, especially for its applications, for example, in privacy amplification in quantum
key distribution protocols and in the quantum bounded storage model [KMR05,KR05,KT08].

However when M is a quantum system, the min-entropy of X, conditioned on M, is not easily 
captured since conditioning on a quantum system needs to be carefully defined. An alternate way 
to capture the correlation between X and M is via the guessing probability. Let us consider the 
following definition. 

Definition 9 (Guessing-entropy). Let $X$ be a classical random variable taking values in $\mathcal{X}$. Let
$M$ be a correlated quantum system with the joint classical-quantum state being
$\rho_{XM} = \sum_x \Pr[X = x]\,|x\rangle\langle x| \otimes \rho_x$. Then the guessing-entropy of $X$ given $M$, denoted $S_g(X \leftarrow M)$, is defined to be:

$$S_g(X \leftarrow M) := -\log \max_{E} \sum_x \Pr(X = x)\,\mathrm{Tr}(E_x \rho_x),$$

where the maximum is taken over all POVMs $E := \{E_x : x \in \mathcal{X}\}$. (Please refer to [NC00] for a
definition of POVMs.)

The guessing-entropy turns out to be a useful notion in quantum contexts. Let $h, X, Y, M$ be as
before, where $M$ is a quantum system. König and Terhal [KT08] have (roughly) shown that if the
guessing entropy $S_g(X \leftarrow M)$ is at least $k$, then given $M$ and $Y$ (and not $X$), $h(X,Y)$ is still close
to a uniform bit. We state their precise result here.



Theorem 14. Let $\epsilon \in (0,1/2)$. Let $h : \mathcal{X} \times \mathcal{Y} \to \{0,1\}$ be a strong $(k,\epsilon)$-extractor. Let $U$ be the
uniform distribution on $\{0,1\}$. Let $YXM$ be a classical-quantum system with $YX$ being classical
and $M$ quantum. Let $Y$ be uniformly distributed and independent of $XM$ and,

$$S_g(X \leftarrow M) \;\ge\; k + \log 1/\epsilon.$$

Then,

$$\|h(X,Y)YM - U \otimes YM\|_1 \;\le\; 6\sqrt{\epsilon}.$$

We show a similar result as follows. 

Theorem 15. Let $\epsilon \in (0,1/2)$. Let $h : \{0,1\}^n \times \{0,1\}^m \to \{0,1\}$ be a strong $(k,\epsilon)$-extractor. Let
$U$ be the uniform distribution on $\{0,1\}$. Let $YXM$ be a classical-quantum system with $YX$ being
classical and $M$ quantum. Let $X$ be uniformly distributed on $\{0,1\}^n$. Let $Y$ be uniformly distributed
on $\{0,1\}^m$ and independent of $XM$ and,

$$I(X:M) \;<\; b(\epsilon) \cdot (n - k). \qquad (15)$$

Then,

$$\|h(X,Y)YM - U \otimes YM\|_1 \;<\; 1 - a(\epsilon), \qquad (16)$$

where $a(\epsilon) := \frac{1}{4} \cdot (\frac{1}{2} - \epsilon)^3$ and $b(\epsilon) := \epsilon \cdot \big(S(\frac{1}{4} - \frac{\epsilon}{2}) - S(\frac{1}{8} - \frac{\epsilon}{4})\big)$.

Before proving Thm. 15, we will make a few points comparing it with Thm. 14.

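1. Let us observe that if $M$ is a classical system, then

$$\begin{aligned}
S_g(X \leftarrow M) &= -\log \mathbb{E}_{m \leftarrow M}\big[2^{-S_\infty(X \mid M = m)}\big] \\
&\le \mathbb{E}_{m \leftarrow M}[S_\infty(X \mid M = m) \cdot \log_2 2] \\
&\le \mathbb{E}_{m \leftarrow M}[S_\infty(X \mid M = m)] \\
&\le S(X \mid M).
\end{aligned}$$

The first inequality follows from the convexity of the exponential function. The last inequality
follows easily from definitions. This implies,

$$I(X:M) = S(X) - S(X \mid M) \;\le\; S(X) - S_g(X \leftarrow M). \qquad (17)$$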

So if M is classical, then the implication of Thm. 15 appears stronger than the implication in 
Thm. 14 (although being weak in terms of the dependence on e.) We cannot show the inequality 
(17) when M is a quantum system but conjecture it to be true. If the conjecture is true, Thm. 15 
would have stronger implication than Thm. 14 in the quantum case as well. 

2. The proof of Thm. 14 in [KT08] crucially uses some properties of the so called pretty good 
measurements (PGMs). Our result follows here without using PGMs and via completely different 
arguments. 

3. Often in applications concerning the quantum bounded storage model, an upper bound on the
number of qubits of $M$ is available. This implies the same upper bound on $I(X:M)$. If this
bound is sufficiently small that it satisfies the assumption of Thm. 15, then $h$ could be used
to extract a private bit successfully, in the presence of a quantum adversary.
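The chain of inequalities in the first remark above can be checked numerically when $M$ is classical. In that case the optimal guess for $X$ given $M = m$ is the most likely value of $X$, so the maximum in Definition 9 equals $\sum_m \max_x \Pr[X = x, M = m]$. The sketch below compares $S_g(X \leftarrow M)$ with $S(X \mid M)$ on a hypothetical joint distribution; all names and numbers are assumptions made for illustration.

```python
import math


def guessing_entropy(joint):
    """S_g(X <- M) for classical M: -log2 of sum_m max_x Pr[X = x, M = m]."""
    best = {}
    for (x, m), p in joint.items():
        best[m] = max(best.get(m, 0.0), p)
    return -math.log2(sum(best.values()))


def conditional_entropy(joint):
    """S(X | M) in bits for a joint distribution {(x, m): prob}."""
    pm = {}
    for (_, m), p in joint.items():
        pm[m] = pm.get(m, 0.0) + p
    return -sum(p * math.log2(p / pm[m]) for (_, m), p in joint.items() if p > 0)


# Hypothetical joint distribution of (X, M), not taken from the paper.
toy_xm = {(0, "a"): 0.4, (1, "a"): 0.1, (0, "b"): 0.2, (1, "b"): 0.3}
print(guessing_entropy(toy_xm), conditional_entropy(toy_xm))
```

For this example the first value is smaller than the second, in line with inequality (17) for classical $M$.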

Let us return to the proof of Thm. 15. We begin with the following key observation. It essentially 
states that a boolean function which can extract a bit from sources of low min-entropy has high 
one-way rectangle bound under the uniform distribution. 

Lemma 7. Let $\epsilon \in (0,1/2)$. Let $h : \{0,1\}^n \times \{0,1\}^m \to \{0,1\}$ be a strong $(k,\epsilon)$-extractor. Let
$\mu := U_n \otimes U_m$, where $U_n, U_m$ are uniform distributions on $\{0,1\}^n$ and $\{0,1\}^m$ respectively. Then

$$\mathrm{rec}^{1,\mu}_{1/2-\epsilon}(h) \;\ge\; n - k.$$

Proof. Let $R := S \times \{0,1\}^m$ be any one-way rectangle where $S \subseteq \{0,1\}^n$ with $\mu(R) \ge 2^{-n+k}$, which
essentially means that $|S| \ge 2^k$. Let $X$ be uniformly distributed on $S$. This implies that $S_\infty(X) \ge k$.
Let $Y$ be uniformly distributed on $\{0,1\}^m$. Since $h$ is a strong extractor, from Definition 8 we have
(where $U$ is the uniform distribution on $\{0,1\}$):

$$\begin{aligned}
&\|h(X,Y)Y - U \otimes Y\|_1 \le 2\epsilon \\
&\Rightarrow \quad \mathbb{E}_{y \leftarrow Y}[\|h(X,y) - U\|_1] \le 2\epsilon.
\end{aligned}$$

We note that, from Definition 5, the above implies that $\mu_R$ is not $(1/2 - \epsilon)$-monochromatic. Hence from
the definition of the rectangle bound, Definition 6, we have $\mathrm{rec}^{1,\mu}_{1/2-\epsilon}(h) \ge n - k$.

We will also need the following information theoretic fact. 

Lemma 8. Let $RQ$ be a joint classical-quantum system where $R$ is a classical boolean random
variable. For $a \in \{0,1\}$, let the quantum state of $Q$ when $R = a$ be $\rho_a$. Then there is a measurement
that can be done on $Q$ to guess the value of $R$ with probability $\frac{1}{2} + \frac{1}{2} \cdot \|RQ - U \otimes Q\|_1$.

Proof. Let us note that

$$\|RQ - U \otimes Q\|_1 = \|\Pr[R = 0]\,\rho_0 - \Pr[R = 1]\,\rho_1\|_1.$$

Now Helstrom's Theorem (Thm. 9) immediately helps us conclude the desired statement.

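We are now ready for the proof of Thm. 15.
Proof of Thm. 15: We prove our result in the contrapositive manner. Let,

$$\|h(X,Y)MY - U \otimes MY\|_1 \;\ge\; 1 - a(\epsilon).$$

Note that this is equivalent to:

$$\mathbb{E}_{y \leftarrow Y}[\|h(X,y)M - U \otimes M\|_1] \;\ge\; 1 - a(\epsilon). \qquad (18)$$

Let us consider a one-way communication protocol $\mathcal{P}$ for $h$ where the inputs $X$ and $Y$ of Alice and
Bob respectively are drawn independently from the uniform distributions on $\{0,1\}^n$ and $\{0,1\}^m$
respectively. Let $\mu$ be the distribution of $XY$. Now let $M$ be sent as the message of Alice in $\mathcal{P}$.
Note that now (18) along with Lem. 8 implies that the distributional error of $\mathcal{P}$ will be at most
$a(\epsilon)/2 = \frac{1}{8} \cdot (\frac{1}{2} - \epsilon)^3$. Let $\epsilon' := 1/2 - \epsilon$. Therefore $\mathcal{P}$ has distributional error at most $\epsilon'^3/8$. Arguing
as in the proof of Thm. 13 we get that,

$$\begin{aligned}
I(X:M) &\ge \frac{1}{2} \cdot (1 - 2\epsilon')\big(S(\epsilon'/2) - S(\epsilon'/4)\big) \cdot \mathrm{rec}^{1,\mu}_{\epsilon'}(h) \\
&= \epsilon \cdot \big(S(\tfrac{1}{4} - \tfrac{\epsilon}{2}) - S(\tfrac{1}{8} - \tfrac{\epsilon}{4})\big) \cdot \mathrm{rec}^{1,\mu}_{1/2-\epsilon}(h) \\
&\ge b(\epsilon) \cdot (n - k).
\end{aligned}$$

The last inequality follows from Lem. 7 since $h$ is a strong $(k, \epsilon)$-extractor.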

□ 

6 Conclusion 

In the wake of our quantum lower bound result, it is natural to ask whether in the two-way model 
also, there is a similar relationship between quantum distributional communication complexity of a 
function /, under product distributions, and the corresponding rectangle bound. 

Concerning the classical upper bound, a natural question to ask is whether the bound could be
tightened, especially in terms of its dependence on the mutual information $I(X:Y)$ between the
inputs, under a given non-product distribution. For example, could it be that for a boolean function
$f$ and a distribution $\mu$ on the inputs, $D^{1,\mu}_{\epsilon}(f) = O(I(X:Y) + \mathrm{VC}(f))$?
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A 



Let $n \ge 1$ be a sufficiently large integer. Let the Noisy Partial Matching ($\mathrm{NPM}_n$) function be as
follows.

Input:

Alice: A string $x \in \{0,1\}^{2n}$.

Bob: A string $w \in \{0,1\}^n$ and a matching $M$ on $[2n]$ consisting of $n$ disjoint edges.

Output:

For a matching $M$ and a string $x$, let $Mx$ represent the $n$-bit string corresponding to
the $n$ edges of $M$ obtained as follows. For an edge $e := (i,j)$ in $M$, the bit included in
$Mx$ is $x_i \oplus x_j$, where $x_i, x_j$ represent the $i$-th and $j$-th bits of $x$.

Output bit $b \in \{0,1\}$ if and only if the Hamming distance between the strings $(Mx) \oplus b^n$
and $w$ is at most $n/3$. If there is no such bit $b$ then output 0.



Now let the non-product distribution $\mu$ on inputs of Alice and Bob be as follows. Let Alice be
given $x$ drawn uniformly from $\{0,1\}^{2n}$. Let Bob be given a matching $M$ drawn uniformly from the set
of all matchings on $[2n]$. With probability 1/2, Bob is given $w$ uniformly from the set of all strings
with Hamming distance at most $n/3$ from $Mx$, and with probability 1/2, he is given $w$ uniformly
from the set of all strings with Hamming distance at most $n/3$ from $(Mx) \oplus 1^n$. Note that in $\mu$ there
is correlation between the inputs of Alice and Bob and hence $\mu$ is non-product. Now we have the
following.

Theorem 16 ([GKK+07], implicit). Let n> 1 be a sufficiently large integer and let e G (0, 1/2). 
Let NPM„ and ji be as described above. Then, rec^'''(NPM„) = fi{^) whereas Q^'^(NPM„) = 
O(logn). 



